2024-11-19 13:51:41.AIbase.
Peking University Team Releases Multimodal Model LLaVA-o1, Inference Capabilities Comparable to GPT-o1!
2024-11-19 09:54:07.AIbase.
Mistral Launches the Most Powerful Open Source Multimodal Model Pixtral Large, Upgrading Le Chat to Directly Call Flux Pro
2024-10-25 11:16:59.AIbase.
Salesforce AI Research Unveils New Multimodal Model BLIP-3-Video: Cost-Effective Video Understanding
2024-09-27 17:37:02.AIbase.
Super Powerful Multimodal Model Emu3: Understanding Images and Videos Through Next-Token Prediction
2024-09-26 14:34:11.AIbase.
The Open Source Multimodal Model Molmo Can Recognize Objects in Images and Generate Accurate Descriptions
2024-08-13 08:15:52.AIbase.
Starred Over Ten Thousand! ModelBest's MiniCPM-V2.6 Model Tops GitHub
2024-08-02 09:04:21.AIbase.
Google Launches Powerful Multimodal Model Gemini 1.5 Pro, Outranking GPT-4o and Claude-3.5 Sonnet
2024-07-31 17:56:44.AIbase.
Shusheng · Puyu Lingbi Multimodal Model Upgrade Version 2.5 Supports Longer Contexts and Image-Video Understanding Comparable to GPT-4V
2024-07-16 10:24:06.AIbase.
Meta to Unveil Massive Multimodal Model Llama 3 405B on July 23
2024-07-14 10:34:47.AIbase.
New Breakthrough in Video Understanding! Google Unveils Universal Video Model VideoPrism for Precise Classification, Localization, and Retrieval All in One!
2024-07-08 11:36:01.AIbase.
Kuaishou's Open-Source Image Generation Model Kolors Enables Text Integration into Imagery
2024-07-08 09:52:43.AIbase.
Keenon AI Unleashes New Features: Web Interface Launched with First- and Last-Frame and Camera Movement Controls
2024-07-05 16:36:50.AIbase.
Anime Enthusiast's Blessing! Domestic Anime AI YoYo Goes Viral - Customizable 2D Waifus at Will!
2024-07-04 16:07:51.AIbase.
Step Stars Unveils Three Models: Step-2 and Beyond, Emphasizing Multimodal Capabilities
2024-07-04 14:31:38.AIbase.
New AI Features Unveiled for Google Pixel 9: An Intelligent Experience Similar to Microsoft's Recall Is on the Horizon!
2024-07-04 10:48:36.AIbase.
Open-Source Local Real-Time Multimodal Model Moshi: Real-Time Speech Generation with Support for Multiple Accents
Moshi, an open-source, real-time multimodal model, generates speech instantaneously and supports various accents.
2024-07-04 10:37:00.AIbase.
HKU & ByteDance Unveil LlamaGen: Open-Source Autoregressive Text-to-Image Model, Revolutionizing Image Generation
2024-07-03 11:17:02.AIbase.
Cyber Enthusiast Connects GPT-4V to Home Camera, Attracts Millions of Viewers!
A tech-savvy individual connected GPT-4V to their home surveillance camera, drawing an audience of millions and sparking interest in everyday applications of AI combined with real-time monitoring.
2024-07-03 10:26:37.AIbase.
Tencent's Translation AI Company TRANSAGENTS Launches, Capable of Translating Extensive Literary Content
2024-07-02 13:51:29.AIbase.